Unix shell for demo coders
brioche/aspirine

00 - Table of Contents

	01 - Introduction
	02 - Finding Files
	03 - Getting Information about a File
	04 - Batch Processing
	05 - On-the-fly Text Editing
	06 - The AWK Pattern Matching Language
	07 - I'm the Operator with my Pocket Calculator
	08 - Quick tricks
	09 - Closing Words
	0A - References
	0B - Acknowledgments

01 - Introduction

Although there are more and more Linux users every day, there are not that many demos for Linux. This is probably due to the lack of an audience, but how can a real scene grow if no one writes demos for Linux? Or at least demos that are portable to Linux...

But even if you don't plan to release your demo for Linux, you may still consider using it as a development environment for its performance, its reliability and the wide range of powerful programming tools freely available.

This is definitely NOT a tutorial on the Unix shell. I'll assume that you already know some commands (like cat, ls, grep, ...) and how to perform some basic data manipulation with the shell (like piping and redirections). What we'll discuss here is how the shell can help you in the development of your demos. I'll also assume that you're using the Bourne (Again) Shell (sh or bash).

This is not a complete guide either (there's no point copying the manpages!), it is just a bunch of hints to point you to the tools that maybe you have been searching for...

Note: a lot of windozes were hurted during the making of this article. :)

02 - Finding Files

In Linux you have basically two commands to find files: find and locate. Ok, I agree: 'ls' works too... :)

find is the good old Unix command and locate is a faster search tool using a database which is updated by a crontab'ed find-based script.

If you want to list all the C++ sources in your home directory, just type:

	$ find ~ -name '*.cc' -print

In this case you must protect the * metacharacter with quotes so that it is not interpreted by the globber. The globber is the component of the shell that expands wildcard arguments on the command line.

find has a special parameter to execute a command for each item found. If you want to remove all the core files dumped in your personal src/ directory:

	$ find ~/src -name 'core' -exec rm {} \;

find substitutes the matched path of the file for the {}. find also expects a ';' to mark the end of the command line to be -exec'ed; we escape the special character ';' to prevent the shell from interpreting it as a command separator, so that it is passed to find as an argument.

You can combine several queries with the '-o' (logical or) operator.
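
As a small illustration of '-o', the sketch below lists both C++ sources and headers in one pass. The scratch directory and the file names are made up for the example; the escaped parentheses group the two tests so that -print applies to both of them.

```shell
#!/bin/sh
# Scratch tree with made-up file names, just for the demonstration
tmp=$(mktemp -d)
mkdir "$tmp/src"
touch "$tmp/src/main.cc" "$tmp/src/misc.h" "$tmp/src/readme.txt"

# \( ... \) groups the two -name tests; -o is the logical or
found=$(find "$tmp/src" \( -name '*.cc' -o -name '*.h' \) -print | sort)
echo "$found"

rm -rf "$tmp"
```

Without the parentheses, -print would bind only to the second -name test, which is a classic surprise.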

The 'locate' utility is a hell of a lot faster than find but doesn't have nearly as many features. It only accepts a pattern (see the manpage for pattern matching specifications) and searches the file database for matches.

To list all the files containing a pattern and residing in a subtree of the file system:

	$ grep -l the_pattern `find path/to/a/directory -print`
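
The backquote form gets into trouble when find returns a very long list or names containing spaces. A possible alternative, sketched here with an invented directory and pattern, is to let find run grep itself; the '+' terminator (instead of \;) hands many file names to a single grep.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir "$tmp/my project"                      # note the space in the name
printf 'class Foo {};\n' > "$tmp/my project/foo.h"
printf 'nothing here\n'  > "$tmp/my project/notes.txt"

# -type f skips directories; {} + batches the file names for grep
matches=$(find "$tmp" -type f -exec grep -l 'class' {} +)
echo "$matches"

rm -rf "$tmp"
```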

03 - Getting Information About a File

If you want to take a quick look at a text file without having to load it in a text editor, you should give the commands head and tail a try.

This one prints the first 6 lines of a C++ header.

	$ head -n 6 LUT.h

	#include <iostream>
	#include <cmath>

	template <class T> class Lookup_Table {
	public:
	$ ...

And this one shows the last 5 lines of code that contain the word 'eax'.

	$ cat blit_x86.asm | grep eax | tail -n 5
		mov   eax, [frame_buffer]
		and   eax, 0x7f7f7f7f
		or    ebx, edx	     ;pitch is stored in eax at this time
	$ ...

If you just want to know which files contain a pattern, use the -l switch of grep:

	$ grep -l class *
	misc.h
	misc.cc
	readme.txt
	$ ...

Ok, we've seen that it's fairly easy to find information in text files, but what about binary files? Linux has a magic tool called 'file' that scans a file's header and attempts to find out what is actually in the file. This tool is very powerful. You want to know the resolution of a picture? Just type in the following command:

	$ file texture.png
	texture.png: PNG image data, 256 x 256, 8-bit/color RGB, non-interlaced
	$ ...

Ain't that really cool? This works for most picture and animation files.

You can also take a look in a binary using the od (octal dump) utility.
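
For example, every PNG file starts with the same eight-byte signature, and od can show it. The sketch below fabricates those eight bytes with printf instead of reading a real picture: -A n drops the offset column, -t x1 prints one hexadecimal byte at a time and -N 8 stops after eight bytes.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Fake "picture": only the 8-byte PNG signature, written with octal escapes
printf '\211PNG\r\n\032\n' > "$tmp/fake.png"

# Dump the first 8 bytes as hexadecimal, without the offset column
sig=$(od -A n -t x1 -N 8 "$tmp/fake.png")
echo "$sig"

rm -rf "$tmp"
```

The exact spacing of the dump depends on your od, but the byte values should read 89 50 4e 47 0d 0a 1a 0a.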

To count lines, words or characters in one or more text files, use wc. The following command line counts the number of lines of code of the example project:

	$ wc -l *.{cc,h}
	     28 adapter.cc
	     65 dynamic_linker.cc
	     64 main.cc
	     48 plugin1.cc
	     50 plugin2.cc
	     50 proxy.cc
	     54 adapter.h
	     43 dynamic_linker.h
	     49 exception.h
	     34 plugin.h
	     34 plugin1.h
	     34 plugin2.h
	     41 proxy.h
	    594 total
	$ ...

The output of wc would have been different, had I written this:

	$ cat *.{cc,h} | wc -l
	    594
	$ ...

This may be what you want in some cases. The same combination can be used to count how many files are in a directory:

	$ ls *.cc | wc -l
	      6
	$ ...

04 - Batch Processing

Any good multitasking operating system allows you to perform non-interactive processing as background jobs. To run a program in the background, put an '&' at the end of the command line; type 'jobs' to see which jobs are running and type 'fg' to bring one back to the foreground. If you have several jobs running, you may identify them as %job-id (see the output of the jobs command).
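
Inside a script there is no terminal to type 'fg' into, so the usual companion of '&' is the built-in 'wait'. A minimal sketch, with 'sleep' standing in for a long conversion job and invented log file names:

```shell
#!/bin/sh
tmp=$(mktemp -d)

# Two pretend "long" jobs started in the background
( sleep 1; echo "textures done" > "$tmp/job1.log" ) &
pid1=$!
( sleep 1; echo "tables done"   > "$tmp/job2.log" ) &
pid2=$!

echo "both jobs are running, the shell is free for other work"

# Block until each job has finished; wait returns the job's exit status
wait "$pid1" && wait "$pid2"
status=$?

result="$(cat "$tmp/job1.log") / $(cat "$tmp/job2.log")"
echo "$result (status $status)"

rm -rf "$tmp"
```

$! holds the process id of the most recent background job, which is why it is saved right after each '&'.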

There are lots of useful things to do as background jobs:

- converting/processing files (graphics, 3D scenes, etc)
- generating static tables/textures/...
- compressing data ([bg]zipping files)
- compiling programs
- ...

When doing such jobs, you often have to deal with a set of files. Let's check out how to optimize this with some groovy shell scripts.

If you wanna do batch image processing, you'll have to install the ImageMagick package, just for one thing: the 'convert' utility. It allows you to perform advanced manipulation on pictures using command line arguments. In the following example, we'll convert all the BMP pictures in the directory into a raw 24-bit RGB format and then compress them with gzip (an LZ77-based compressor).

	#!/bin/sh

	# this is the extension of the files to be converted
	ext=bmp

	for src in *.$ext ; do
		# strip the extension: foo.bmp -> foo
		dest=`basename "$src" .$ext`
		convert "$src" rgb:"${dest}.rgb"
		gzip -9 "${dest}.rgb"
	done

This will create some *.rgb.gz files that are easy to decompress within your C or C++ programs using zlib. I wrote a handy C++ facade to the zlib decompression routines; you should find it on the aspirine website... or ask Dario Phong! :)
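
If you would rather not start 'basename' for every file, the shell's own ${var%pattern} expansion strips a suffix too. The sketch below only builds the destination names for a couple of empty placeholder files (the names are invented); it does not call convert, so it runs without ImageMagick installed.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch logo.bmp "sky box.bmp"      # placeholder files, not real pictures

ext=bmp
names=""
for src in *."$ext" ; do
	dest=${src%."$ext"}           # strip the trailing .bmp
	names="$names[$dest.rgb]"     # collect what convert would be asked to write
done
echo "$names"

cd / && rm -rf "$tmp"
```

The quotes around "$src" and friends matter as soon as a file name contains a space.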

05 - On-the-fly Text Editing

All Unix systems have a set of very powerful tools which may be used to edit streams of characters.

The tr (translate) command allows you to delete, squeeze or transpose characters.

The following command squeezes every sequence of spaces in a file down to a single space. Note that tr only reads its standard input, hence the redirection.

	$ tr -s ' ' < a_file_with_lots_of_spaces

And this one turns the whole file to upper case:

	$ tr 'a-z' 'A-Z' < a_lower_case_text_file

Check out the manpage for further information about this resourceful tool.
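
One more use that comes up when sources travel between DOS and Unix: deleting carriage returns with -d. A small sketch on a made-up two-line input:

```shell
#!/bin/sh
# printf fabricates a DOS-style text (CR LF line endings)
unix_text=$(printf 'mov eax, 1\r\nret\r\n' | tr -d '\r')
echo "$unix_text"

# Count the bytes before and after to see the two CRs disappear
before=$(printf 'mov eax, 1\r\nret\r\n' | wc -c | tr -d ' ')
after=$(printf 'mov eax, 1\r\nret\r\n' | tr -d '\r' | wc -c | tr -d ' ')
echo "$before bytes -> $after bytes"
```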

I've heard lots of sceners say that C++ sucks because you have to write truckloads of text (such as class definitions) before really starting to write a single line of effective code. Although I'm one of those C++ enthusiasts, I have to admit that they're not totally wrong... but it's possible to speed things up!

Let's say you have somewhere a template source file looking like this:

	// -*- C++ -*- (for X?emacs buddies only!)
	// $Id: shell.txt,v 1.1 2000/04/15 20:44:48 alcibiade Exp $
	//
	// <ClassName>.h
	//
	// Copyleft 1999-2000, brioche of aspirine (brioche@linuxbe.org)
	//

	#ifndef __<ClassName>_h_included
	#define __<ClassName>_h_included

	class <ClassName> {
	public:
		<ClassName>();
		<ClassName>( const <ClassName> &other );
		~<ClassName>();
		const <ClassName>& operator = ( const <ClassName> &other );
		// add whatever you find useful here
	};

	#endif // __<ClassName>_h_included

Now we want to replace all <ClassName> patterns with the name of the class to be written. Check out the tiny 'mkh' script which does the job:

	#!/bin/sh
	name=$1
	# double quotes let the shell expand $name; g replaces every occurrence on a line
	sed -e "s/<ClassName>/$name/g" template.h > "${name}.h"

To create the C++ header file from the shell, we simply write:

	$ mkh Texture_Warper

And you'll have all the skeleton of your code generated in Texture_Warper.h. Of course you may also generate the skeleton code of the .cc file!
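
For the .cc side, a here-document can do the same job without a separate template file. The function name 'mkcc' and the layout of the skeleton are my own invention for this sketch, so adapt them to your coding style.

```shell
#!/bin/sh
# mkcc (hypothetical helper): write a skeleton <name>.cc in the current directory
mkcc() {
	name=$1
	# Unquoted EOF: the shell expands ${name} inside the here-document
	cat > "${name}.cc" <<EOF
#include "${name}.h"

${name}::${name}()
{
}

${name}::~${name}()
{
}
EOF
}

# Try it in a scratch directory
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkcc Texture_Warper
first_line=$(head -n 1 Texture_Warper.cc)
member_count=$(grep -c 'Texture_Warper::' Texture_Warper.cc)
cat Texture_Warper.cc
cd / && rm -rf "$tmp"
```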

06 - The AWK Pattern Matching Language

AWK is the ancestor of PERL and is one of the most powerful shell tools available for Unix. The basic task of an AWK program is to scan a stream of characters, process information and print the result on the standard output stream. AWK supports several types of pattern matching conditions; just check out the man pages or the nice documentation of GNU AWK written by Bob Withers.

What exactly does the AWK interpreter do? It just applies all the "rules" you defined (according to their pre-requisites) to the current line of text, or record (field and record separators may be changed).

For instance, if you have a text file with one record per line and comma-separated fields, you parse it as follows:

	# Change FS (Field Separator, blank characters by default)
	BEGIN {
		FS = ","
		cnt = 0
	}

	# Print only records when the 2nd field is != 0
	$2 != 0 {
		print $0
		cnt++
	}

	END {
		print cnt, "record(s) printed"
	}

The special patterns BEGIN and END are respectively executed before and after the core processing of the input.
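
To try such rules without creating a program file, the whole thing fits on the command line; -F sets FS just like the BEGIN block does. The three records fed to awk here are made up.

```shell
#!/bin/sh
# Three comma-separated records; the second one has 0 in its 2nd field
out=$(printf 'cube,8,12\nempty,0,0\npyramid,5,6\n' |
	awk -F, '$2 != 0 { print $0; cnt++ }
	         END     { print cnt + 0, "record(s) printed" }')
echo "$out"
```

The 'cnt + 0' forces a number, so an input with no matching record still prints 0 instead of an empty string.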

One of the most used features of AWK is associative arrays. Just as in PERL, you may use almost any value as an index in a table. AWK manages the hashtables for you.

	# Count words in a text, process each line
	{
		# For each word, add one more occurrence
		for ( i = 1; i <= NF; i++ )
			cnt[$i]++
	}

	# Print results
	END {
		# For all the words in the hashtable
		for ( word in cnt )
			printf "%s appears %d time(s)\n", word, cnt[word]
	}
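
The order in which 'for ( word in cnt )' walks the table is not specified, so the output is usually piped through sort. A sketch with an invented two-line text, printing the count first to make sorting easy:

```shell
#!/bin/sh
top=$(printf 'plasma tunnel plasma\nrotozoom plasma tunnel\n' |
	awk '{ for ( i = 1; i <= NF; i++ ) cnt[$i]++ }
	     END { for ( word in cnt ) print cnt[word], word }' |
	sort -k1,1nr -k2,2 |
	head -n 3)
echo "$top"
```

sort -k1,1nr orders by the count, biggest first, and -k2,2 breaks ties alphabetically.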

AWK is damn powerful and its syntax looks almost exactly like C, while PERL's syntax just sucks. Take a look at the following script: it converts a 3D-Studio ASCII mesh into an easier-to-read simple 3D object format.

#!/bin/gawk -f
#
#	A very simple Autodesk 3D-Studio .ASC mesh file reader/converter
#	Copyleft 1998/99 - brioche/aspirine <brioche@linuxbe.org>
#
#	Just feel free to do what you have to do with this piece of code
#

#
#	Check out command line arguments and initialize the right stuff
#

BEGIN {

	# set field separator regexp
	FS = "([ \t]+|[ \t]*:[ \t]*)"

	scale = 1.0
	map = 0
	header_ok = 0

	for ( arg = 1; arg < ARGC; arg++ ) {
		if ( ARGV[arg] ~ /^(-s|--scale)$/ ) {
			scale = ARGV[arg+1]
			delete ARGV[arg]
			delete ARGV[arg+1]
		}
	}

}

#
#	Print error message if we failed parsing the header
#

END {
	if ( !header_ok )
		printf "\nOoops! Error parsing header...\n"
}

#
#	Search for line starting with "Mapped" to check whether --
#	normalized mapping coordinates (u,v) are present in the script
#

/^Mapped/ {
	map = 1
}

#
#	Parse header to find out number of vertices and triangles
#

/Tri-mesh/ {
	nb_vert = $3
	nb_tri = $5
	printf "%d\n%d\n", nb_vert, nb_tri
	header_ok = 1
}

#
#	Parse 3D coordinates list
#

/^Vertex [0-9]+/ {
	if ( map )
		printf "%f %f %f %f %f\n", $4*scale, $6*scale, $8*scale, $10, $12
	else
		printf "%f %f %f 0.0 0.0\n", $4*scale, $6*scale, $8*scale
}

#
#	Parse triangle vertex reference list
#

/^Face [0-9]+/ {
	printf "%d %d %d\n", $4, $6, $8
}

Ain't that really cool? Think about the time you'd have spent writing this (with the same level of code safety) in C/C++ and without ripping someone else's loader! :)

AWK has also lots of built-in functions for string manipulation/tokenizing and even maths. I already used AWK scripts to generate scenes for my raytracing engine, I told you it was really useful!
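
As a taste of the maths functions, here is a sketch that generates a small sine lookup table as a C array. The helper name 'sintab' and the output layout are assumptions made for the example; sin, atan2 and int are standard AWK built-ins.

```shell
#!/bin/sh
# sintab (hypothetical helper): print N unsigned 8-bit sine samples as a C array
sintab() {
	awk -v n="$1" 'BEGIN {
		pi = atan2(0, -1)
		printf "unsigned char sintab[%d] = {", n
		for ( i = 0; i < n; i++ )
			printf "%s %d", (i ? "," : ""), int(128 + 127 * sin(2 * pi * i / n) + 0.5)
		print " };"
	}'
}

table=$(sintab 4)
echo "$table"
```

Redirect the output of something like 'sintab 256' into a header file and the table is ready to be #included.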

07 - I'm the Operator with my Pocket Calculator

The program 'bc' is an arbitrary precision calculator language. It can get pretty complex, but it works fine for small computations too.

With 'bc', you can easily do number conversion as follows:

	$ bc
	obase=2
	193
	11000001
	obase=16
	27
	1B

Just type EOF (CTRL+D) to exit the calculator. Note that it is also possible to write functions and loops.

Take a look at the manpage and you'll see how many functions are available. This tool is great for building quick-n-dirty programs that perform heavy calculations!

08 - Quick tricks

And now some miscellaneous tips and tricks.

You want to compile two modules without writing a Makefile? Just use the && operator:

	$ gcc -c module.c && gcc main.c module.o

The second part of the expression will be evaluated only if the first part is true (i.e. gcc's return status is 0). On the other hand, if you want to perform an action only if another one fails, use the || (logical or) operator:

	$ test -x a_file || chmod u+x a_file

This line checks whether a file is executable; if it is not, it adds the execute permission for the user.

If you want to make a multiple move, you'll have to generate a sequence of 'mv' and pipe them into a shell since you can't write something handy like 'mv *.c *.c.old'. The following gorgeous command line will do what you want:

	$ ls *.c | sed -e 's/.*\.c/mv & &.old/g' | sh

sed rewrites each file name coming from the ls command into a command line like 'mv x x.old' (in sed, the & stands for the text matched by the pattern), and we pipe the result into a shell to be executed. Yes, it works. :) Of course, it's usually a good idea to write this into a shell script or to use a special command like 'mmv' (not standard on all Unices) to perform multiple moves.
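
If you go the shell script way, a plain for loop is a possible shape for it. This sketch works in a scratch directory with made-up file names, one of them containing a space, which the generated-command trick would choke on.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch main.c "old blitter.c" notes.txt   # made-up file names

# Rename every .c file to .c.old; the quotes keep names with spaces intact
for f in *.c ; do
	mv "$f" "$f.old"
done

listing=$(ls)
echo "$listing"
cd / && rm -rf "$tmp"
```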

If you need to go from one side of the filesystem to the other and want to save time when you have to get back to your current directory, use your shell's built-in directory stack (in $DIRSTACK for bash):

	$ pwd
	/usr/local/src/player/core/lib/linux-2.2.x/devel/current
	$ pushd .
	$ cd /home/jdoe/personnal/src/cpp/project2/src/include/opengl/
	... do something out there ...
	$ popd
	$ pwd
	/usr/local/src/player/core/lib/linux-2.2.x/devel/current

Believe me, this is helpful! :)
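
When the excursion boils down to a single command line, a subshell gives a similar effect without any stack: the cd happens inside the parentheses and is forgotten when they close. A sketch using a scratch directory and a marker file invented for the occasion:

```shell
#!/bin/sh
tmp=$(mktemp -d)
start=$(pwd)

# The parentheses start a subshell; its cd does not leak out
( cd "$tmp" && touch visited && echo "working in $(pwd)" )

back=$(pwd)
echo "still in $back"

# The marker file proves that the subshell really worked over there
visited=no
if [ -f "$tmp/visited" ]; then visited=yes; fi
rm -rf "$tmp"
```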

09 - Closing Words

Well, we've reached the end of this article. I hope that you're now all convinced that the Unix shell is a wonderful programming tool. It might be hard when you start (especially if you've never known the old DOS days) but believe me, it helps! It worked for me: now I just can't work in DOS anymore, I'm addicted to Xemacs and a couple of xterms for debug output and shell operations!

And remember that: "Unix is not an operating system, it's an environment!"

0A - References

This book is a must-read for anyone interested in Unix:

	Unix Power Tools, 2nd Edition
	Jerry Peek, Tim O'Reilly, Mike Loukides, et al.
	O'Reilly & Associates

And this one is a must-have for anyone using Unix:

	Unix in a Nutshell
	Daniel Gilly.
	O'Reilly & Associates

Don't forget the man pages! To search the manpage database using a keyword you may use the commands apropos, whatis or man -k.

0B - Acknowledgments

Special thanks to gael (desnos/aspirine) for bringing me into the wonderful world of Linux.
Thanks to Bob Marley too... although he has nothing to do with Unix! =)
Big up to the Asian Dub Foundation for making huuuuuge gigs!
Huge respect to Fuzzion for keeping the old-skool design spirit alive.

brioche of the aspirine d-zign international demo terrorists

RCS - $Id: shell.txt,v 1.1 2000/04/15 20:44:48 alcibiade Exp $